ANALYSIS OF THE TYPES OF PAUSES
IN CSCL CHAT CONVERSATIONS


Abstract. The paper presents research on the role of pauses in Computer-Supported
Collaborative Learning (CSCL) chats. The goals are to analyse the cognitive and social
aspects related to pauses in chats and to identify criteria, based on them, for
grading students. A classification of pauses in chats is introduced, starting from
their duration and from adjacency pairs. Three chats were manually annotated and
statistics were computed. Grading rules for students are proposed based on the types
of pauses.


1. Introduction


Information is exchanged through dialogue at an ever faster pace in the daily life of
this new century. Moreover, the road from simple to complex conversations has been
opened in recent years by the support for debates offered by Web 2.0 (Social Web)
collaborative environments such as instant messengers (chat).


The main factors that shape the exchange of information in collaborative chat
communication are the real-time nature of the conversation, its rhythm, and the
pauses in the flow of discussion.


The main role of chat technology within a collaborative environment is to ensure the
exchange of information between participants, based on the sequence of utterances
that in turn build the communication act [1]. Beyond the general advantage of
encouraging collaboration, chat is particularly useful in education, where it gives
both students and teachers the possibility of much faster learning and assessment.


Chat conversations are a major ingredient of Computer-Supported Collaborative
Learning (CSCL). Several systems have been developed for analysing the interactions
between participants, for example Polyphony [2] and PolyCAFe [3], based on Latent
Semantic Analysis (LSA), Natural Language Processing (NLP) [4, 5, 6] and Social
Network Analysis (SNA) [6, 7]. However, as far as we know, the role of pauses in
conversation has not been considered in previous approaches in CSCL. This paper
takes some steps towards filling this gap.


The paper presents research directed towards analysing the types of pauses in chat
conversations, with the purpose of providing patterns for developing NLP tools that
classify pauses depending on the participants' utterances and the adjacency pairs in
the conversation.


The starting point in the analysis is based on identifying the participants in a 
conversation, their utterances and the interval of time between the participants’ 
utterances. From the participant’s perspective, the analysis and classification of 
the pause patterns involve considering certain cognitive processes, an implication 
that can be seen quantitatively, qualitatively as well as socially. The participants 
provide various answers depending on the questions, can offer solutions, and all 
these aspects are fundamentally important in the construction and identification 
process of pauses. 


In our analysis we used conversations from Computer-Supported Collaborative
Learning sessions in which students had the task of discussing collaborative
technologies ("chat", "wiki", "blog", "forum"), each participant being expected to
support his/her idea about one of these technologies. In these conversations, the
participants and their utterances are the key points, and with their help we
identify the types of pauses in chat conversations. The utterance is the first step
in detecting pauses, and for this we defined a coding (markup) of utterances on which
we build the specific markup of pauses in chat conversations. The patterns of pause
types are identified on an XML coding of the chat logs, which highlights the types
depending on the utterance type, the duration between utterances, and the number of
utterances.


The paper continues by presenting Bakhtin's concept of dialogism, followed by a
section that analyses and classifies the types of pauses. The experiment is presented
in the fourth section, and the following section discusses how the results may be
used for grading students.


2. Concepts of dialogue in chat conversations 


Discourse analysis in our approach is based on Bakhtin's dialogical theory [8, 9, 20]. 
He considered that dialogical relations are "a much broader phenomenon than mere
rejoinders in a dialogue, laid out compositionally in the text; they are an almost 
universal phenomenon, permeating all human speech and all relationships and 
manifestations of human life—in general, everything that has meaning and 
significance." [20, 25].


Dialogue is based on words, which are the essential element of both an online
conversation and a face-to-face conversation. The exchange of words in conversations
is constituted through utterances [9] and represents a bridge between the
participants' contributions. Utterances can be seen as linguistic actions in which
the key element is the word. Theories approaching these linguistic actions were
described by both Mikhail Bakhtin and Ferdinand de Saussure. The difference between
these theories is that Bakhtin considers words also as a kind of utterance, filled
with echoes of other utterances, while de Saussure considered words as arbitrary
signs.


Dialogue can be regarded as an exchange of utterances between several participants,
each utterance being associated with one or more speech acts. Austin introduced the
theory of speech acts, which includes constative and performative acts; the theory
was further developed by Searle [16, 17]. Speech acts were associated with two
classes of functions that form the basis of the DAMSL architecture (Dialog Act Markup
in Several Layers) [21, 22]: anticipatory functions ("forward looking function") and
regressive adaptation functions ("backward looking function"). The exchange of
utterances associated with speech acts and the exchange of words associated with
utterances form the central point of departure in analysing chat conversations.


3. Description of the concept of pause. Types of pauses 


Silence can be a resource for communicating elements of a problem that are not easily
written. It can appear as gaps/lacunae, interruptions (lapses), or pauses [14]. In
this paper we consider the concept of pause as the silence between the participants'
utterances in the conversation, which may take different forms depending on the
number of participants. Important factors in describing the concept of pause are the
number of participants, the time, the written text, and the type of speech act. The
term "pause" entered the Anglo-Saxon literature as early as 1959, when H. Maclay and
C. E. Osgood [13] described it as "hesitation phenomena". We meet these hesitation
phenomena again ("phénomènes d'hésitation") in the work of Maria Candea [18],
described by the following terms: filled pauses ("pauses remplies"), elongated
syllables ("syllabes allongées"), repetitions ("répétitions" [19]) and false starts
("faux départs" [19]).


Before discussing the literature about pauses, we should mention that researchers
consider pauses in different contexts: reading text, monologues, and conversations.
In the latter case, usually only face-to-face or phone conversations involving two
participants are taken into account. Given its explosive growth in recent years,
instant messaging ("chat") should also be considered, and there are important
differences between these two types of conversations. This paper analyses the chat
case.


A first classification of pauses is given by Linde's description, according to which
there are the following types of pauses: "extended pauses between 3-16 seconds, long
pauses between 1-1.9 seconds, and short pauses between 0.1-0.6 seconds" [10]. Also,
the work of Kristine Fors [11] reports that short pauses are the most numerous.
Starting from this classification, long pauses may be indicators of thinking that
result in better answers, reflected in the number of words explaining the ideas of a
sentence.


Short pauses are often associated with the use of short utterances. Their role is to
allow the idea of a sentence to be shared, and they indicate better control over the
interaction between participants.


In terms of utterances, pauses between exchanges of utterances are the most frequent,
followed by pauses with selection, in which one participant is explicitly selected by
another to answer, and then by pauses before the response given by the participant
[12].


If we analyse a chat conversation, we can see the problems faced by the participants
and enumerate some of the complex tasks they may face, such as determining the moment
to intervene, determining whether the speaker intends to continue, and preparing what
they themselves might say. From here we emphasize that both chat and face-to-face
conversations are characterized by pauses, whose main feature is their length.
However, in the instant messenger (chat) case, unlike the face-to-face one, the time
needed to type the text adds to the duration of the pause.


Conversation logs help us define another category of pauses, with the participant in
the lead role. The participant may be called to take over the conversation, to end a
conversation, or to intervene in a conversation. According to these moments of
conversation, the following categories of pauses were identified by Kristina Lundholm
and Jessica Villing [11]:


a) "pause internal within"
b) "pause internal between"
c) "pause initial"


We conclude the classification of the types of pauses by referring again to the
concept of silence and defining a "pause silencieuse" (silent pause) as covering
a) non-structuring pauses and b) structuring pauses [18]. The difference between the
two types is that a structuring pause occurs between two sound sequences emitted by
the same person and is preceded by a sound sequence such as an interjection (for
example, "um") or the repetition of certain words, while a non-structuring pause is
preceded by word repetitions, monosyllabic repetitions, and false starts [18].


All these classifications of the types and features of pauses will be used in
analysing conversations in collaborative environments, together with the types of
pauses associated with the utterance codes, which are described in the next section.
A table of codes for each type of pause, on which the analysis of chat files is
based, is also presented in the next section (Table 3).


4. Experiment 


During chat conversations, a major factor is represented by the explicit references 
[27] and the identification of implicit references, which will help us analyse the 
evolution of the participants in the conversation, identify the types of pauses in the 
conversation, and analyse the participation degree of a student depending on the 
utterances used. This article aims to identify utterance-reply pairs (through 
explicit references) used by the participants, and based on them, to determine the 
types of utterances and the types of pauses depending on the utterance emitted by 
the participant. 


The experiment consisted in analysing the logs of three conversations of some 
computer science students debating, in an assignment at the Human-Computer 
Interaction course, about collaborative technologies (chat, forum, blog, wiki). 
Chat conversations are represented in XML files [2, 3, 7, 25]. The utterances that 
we will analyse are represented as: 


<Turn nickname="participant1"> 
<Utterance genid="11" 
time="03.23.34" 
ref="0"> 
well we are all here..can we start? 
</Utterance> 
</Turn> 


where: 


nickname is the participant's name;

genid is the unique id associated with the utterance;

ref is the id of the utterance to which the current utterance explicitly refers, set
using the chat tool facilities [27]; a value of 0 means that the utterance does not
refer to any other utterance;

time is the moment at which the utterance was written;

the text between the 'Utterance' tags is the utterance itself.
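

As an illustration only, a log in this format could be read with a few lines of
Python. The sketch below is not the tool used in the study. It assumes that the time
attribute has the form hh.mm.ss and that the Turn elements are wrapped in a single
root element (called Dialog here, a hypothetical name); the second turn of the sample
is invented for the example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Sample in the format shown above. The <Dialog> root element and the second
# turn are assumptions made for illustration, not part of the original logs.
SAMPLE_LOG = """
<Dialog>
  <Turn nickname="participant1">
    <Utterance genid="11" time="03.23.34" ref="0">
      well we are all here..can we start?
    </Utterance>
  </Turn>
  <Turn nickname="participant2">
    <Utterance genid="12" time="03.23.41" ref="11">
      yes, let's start
    </Utterance>
  </Turn>
</Dialog>
"""


@dataclass
class Utterance:
    nickname: str
    genid: int
    ref: int       # 0 means "no explicit reference"
    seconds: int   # time of day in seconds (assumes hh.mm.ss)
    text: str


def to_seconds(stamp):
    """Convert an 'hh.mm.ss' time stamp (assumed format) to seconds."""
    hours, minutes, seconds = (int(part) for part in stamp.split("."))
    return hours * 3600 + minutes * 60 + seconds


def parse_log(xml_text):
    """Return the utterances of a chat log in document order."""
    root = ET.fromstring(xml_text.strip())
    utterances = []
    for turn in root.iter("Turn"):
        for utt in turn.iter("Utterance"):
            utterances.append(Utterance(
                nickname=turn.get("nickname"),
                genid=int(utt.get("genid")),
                ref=int(utt.get("ref")),
                seconds=to_seconds(utt.get("time")),
                text=(utt.text or "").strip(),
            ))
    return utterances


for u in parse_log(SAMPLE_LOG):
    print(u.nickname, u.genid, u.ref, u.seconds, u.text)
```

Keeping the time as a number of seconds makes the later computation of durations
between utterances a simple subtraction.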


A manual annotation of the chat log files was performed. Each chat had
4-5 participants and 100-450 utterances.


Question-type and answer-type utterances were identified, each being classified
according to how a participant asked or answered a question.


Starting from a classification of the types of utterances presented in a previous
section, we identified three types of pauses specific to the utterance-reply pairs
used: short, medium, and long. We classified the three types of pauses through a
manual analysis of the three files, based on the number of utterances, the average
time between consecutive utterances, and the utterances used by a single participant.


We present below ways of grading and evaluating the participant in a 
conversation carried in a collaborative environment (chat). 


5. Evaluating and grading the participants in a chat conversation 


We can look at words, groups of words, phrases, interjections, and symbols used in a
question or an answer, in the form of utterances that indicate the intent, mental
state, or other feelings of the participant.


The utterances generated during conversations may refer implicitly or explicitly to
previous ones. Implicit references are distinguished through co-references,
repetitions, lexical chains [14], and inter-animation [15]; explicit references are
made through the referencing facilities of some chat environments, such as
ConcertChat [27].


One factor determining a student's contribution to the chat conversation is the types
of his/her utterances, which can have a positive or a negative aspect. Therefore, the
participants are assessed by means of the set of utterances they use. A student can
answer a participant, ask a question, or continue an idea, all of which are important
in building the utterance set. Another criterion for determining a participant's
grade is how students support their ideas about the chosen technology.


An utterance that has an explicit reference to a previous utterance indicates that
the participant uses that reference to support or criticise an idea, or to continue
it. Such references show that there is strong communication between participants and
are also a very good criterion for the student's assessment. We can also consider an
utterance that is referenced by several utterances as a significant utterance.
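

The idea of a significant utterance can be sketched by counting how often each
utterance id appears as the ref value of other utterances. The threshold of two
references and the sample values below are assumptions chosen for illustration; the
text does not fix a number for "several".

```python
from collections import Counter


def significant_utterances(refs, min_references=2):
    """Return the ids of utterances referenced at least min_references times.

    refs holds one ref value per utterance; 0 means "no explicit reference".
    min_references is a hypothetical threshold, not a value from the study.
    """
    counts = Counter(r for r in refs if r != 0)
    return sorted(uid for uid, n in counts.items() if n >= min_references)


# Hypothetical ref values of eight consecutive utterances.
example_refs = [0, 11, 11, 0, 13, 11, 13, 15]
print(significant_utterances(example_refs))
```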


Besides the communication between participants, there are other factors involved 
in students’ assessment, evidenced by the number of utterances exchanged, the 
type of utterances used, the total number of pauses, pause type, as well as by the 
utterance structure made of the number of utterances. The factors described 
represent the starting point in the analysis of participants, and the participants’ 
final grade is deduced from the factors stated. 




In the assessment process we can take into account four important factors in the 
manual analysis of chat conversation (see Table 1). 


Table 1
Factor                    Factor description
Type of utterance         Question, answer
Structure of utterances   Number of interchanged utterances
Pause                     Time between utterances
Structure of pauses       Number of pauses, type of pauses


The manual analysis focuses on two directions: on the one hand, highlighting the
types of utterances and, on the other hand, highlighting the pauses. From the
perspective of a quantitative analysis, the types of utterances used by a participant
can easily be noticed, which may lead to a description of how to grade the
participant. Another important aspect is the quantitative analysis of the number of
pauses, which also contributes to assessing the student's degree of participation.


In the assessment of a student, we considered the utterances that are explicitly
referenced by the current utterance of the conversation, as well as the utterance
type. The type of utterance and the type of pause are based on codes (mark-ups) of
utterances and codes of the types of pauses. Each chat log file was analysed in terms
of number and duration of pauses, yielding average pause durations of 50.56 s, 66 s,
and 64.39 s for the three files.


Based on these values, we defined three categories of pauses:
a) short pauses in the interval 1 s - 49 s;

b) medium pauses in the interval 50 s - 66 s;

c) long pauses for values greater than 67 s.
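

These thresholds translate directly into a small classification function. The sketch
below assumes durations in whole seconds. Since the categories above leave the value
67 s unassigned, the sketch treats every duration above 66 s as long, and it treats
durations below 1 s as short; both choices are assumptions.

```python
SHORT_MAX_S = 49    # upper bound of the short interval
MEDIUM_MAX_S = 66   # upper bound of the medium interval


def pause_type(duration_s):
    """Classify a pause duration (in whole seconds) as short, medium or long."""
    if duration_s < 0:
        raise ValueError("a pause duration cannot be negative")
    if duration_s <= SHORT_MAX_S:
        return "short"
    if duration_s <= MEDIUM_MAX_S:
        return "medium"
    return "long"   # assumption: everything above 66 s counts as long


for seconds in (11, 36, 50, 66, 67, 120):
    print(seconds, pause_type(seconds))
```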


The results are influenced by the nature of the manual annotation process and by the
factors involved:


• Number of long pauses: 122, number of short pauses: 235, number of
medium pauses: 61


• Number of utterances: between 203 and 397


• Participant's scoring (number of utterances/participant's utterance):
between 2.57 and 7.52


The code set used for tagging the utterances (inspired by previous sets of codes
[23, 24]) is described in Table 2.


Table 2
Code   Meaning
General
S      Statement - affirmation
RS     Regulate - to introduce a rule
Questions
QY     Yes-no question
       Wh-question - question with an answer other than yes/no (when, who, where, etc.)
QR     Or/or-clause question (question "... or ... or ...")
QH     Rhetorical question
QO     Open-ended question
R      Request
O      Offer (for example, a solution, an idea)
Answers
YN     Short Y/N answer: yes, yep, no, ok, k, etc.
A      Agree - acceptance, confirmation with a longer answer than Y/N
D      Disagree - non-acceptance, negation with a longer answer than Y/N
C      Critique
E      Explanation
RE     Repair, correction
RS     Respond, more general than the codes below that are tied to problem solving
F      Continuation, follow
EL     Elaboration, development
EX     Extension (for example, of a question)
U      Uncertain response


Starting from this code set, pause coding is derived. The type of pause (short,
medium, or long) is indicated as a prefix to the actual code. For example, PS-YN
means a short pause for a Y/N answer (yes, yep, no, ok, k, etc.), PM-YN a medium
pause, and PL-YN a long pause for the same type of answer.
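

The prefix convention can be illustrated with a short helper that builds a pause code
from a pause type and an utterance code. The function and dictionary names are
hypothetical; only the PS/PM/PL prefixes and the utterance codes come from the text.

```python
# Prefixes as described above: PS = short, PM = medium, PL = long.
PAUSE_PREFIX = {"short": "PS", "medium": "PM", "long": "PL"}


def pause_code(pause_type, utterance_code):
    """Combine a pause type and an utterance code, e.g. ('short', 'YN') -> 'PS-YN'."""
    if pause_type not in PAUSE_PREFIX:
        raise ValueError(f"unknown pause type: {pause_type}")
    return f"{PAUSE_PREFIX[pause_type]}-{utterance_code}"


print(pause_code("short", "YN"))   # PS-YN
print(pause_code("medium", "YN"))  # PM-YN
print(pause_code("long", "YN"))    # PL-YN
```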


Table 3 presents a fragment of one of the log files, containing the name of the
utterer, the text of the utterance, its number, the referred utterance, its time
stamp, the duration between consecutive utterances, and the pause between an
utterance and the referenced utterance (if there is an explicit reference).


After the manual analysis with the codes described above, the utterance "and so, it
can be confusing" is assigned the code C (critique); the duration between utterances
is 11 s, which leads us to consider it a short critique pause.


The utterance "in blogging however, only allowed users can post, and that makes it
more accurate" has the code E (explanation); the duration between utterances is 36 s,
so we have a short explanation pause.


Table 3
Name of participant | Utterance | Utterance no. | No. of referred utt. | Time stamp | Inter-utterance duration | Pause
Liviu | yes, but wiki has a major problem | 40 | 36 | 03.28.21 | 15 |
Liviu | the major problem of wiki is that too many people can change the content | 41 | | 03.28.49 | 28 |
Liviu | and so, it can be confusing | 42 | 41 | 03.29.00 | 11 | 11
Liviu | yes, but not "everybody" is smart or capable of editing or adding valuable content | 79 | 76 | 03.35.[illegible] | 31 |
Andreea | dragos, if i write something, somebody can come and edit what i wrote, true? | 80 | | 03.35.51 | 4 |
dragos | well yes.... | 81 | | [illegible] | 15 |
Andreea | well that's bad | 82 | | [illegible] | 8 |
dragos | not really | 83 | | [illegible] | 6 |
[illegible] | in blogging however, only allowed users can post, and that makes it more accurate | [illegible] | [illegible] | 03.36.[illegible] | [illegible] | [illegible]
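

The two duration columns of Table 3 can be derived from the time stamps alone. The
sketch below is an illustration, not the procedure used for the manual annotation. It
uses the utterances numbered 40 to 42 with the time stamps 03.28.21, 03.28.49 and
03.29.00 as read from Table 3, and assumes the hh.mm.ss format; the function names
are hypothetical.

```python
def to_seconds(stamp):
    """Convert an 'hh.mm.ss' time stamp (assumed format) to seconds."""
    hours, minutes, seconds = (int(part) for part in stamp.split("."))
    return hours * 3600 + minutes * 60 + seconds


def durations(rows):
    """For each (number, referred number or None, time stamp) row, return
    (number, seconds since the previous utterance, seconds since the
    referred utterance). Values that cannot be computed are None."""
    times = {number: to_seconds(stamp) for number, _, stamp in rows}
    result = []
    previous = None
    for number, referred, _ in rows:
        now = times[number]
        inter = None if previous is None else now - previous
        # The pause is only known when the referred utterance is in the fragment.
        pause = now - times[referred] if referred in times else None
        result.append((number, inter, pause))
        previous = now
    return result


rows = [
    (40, 36, "03.28.21"),    # refers to utterance 36, outside this fragment
    (41, None, "03.28.49"),  # no explicit reference
    (42, 41, "03.29.00"),
]
print(durations(rows))
# [(40, None, None), (41, 28, None), (42, 11, 11)]
```

For utterance 42 the two values coincide (11 s) because the referred utterance is
also the immediately preceding one.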


The diagram in Figure 1 shows the participants in the conversation, the number of
utterances and the number of pauses.


Fig. 1. Representation pattern of the number of
utterances and pauses per participant


Based on the obtained results, all the chat log files were analysed, yielding an
assessment and grading of the participants.


On a scale from 1 to 5 we have considered the following grading categories:


1 - INSUFFICIENT,
2 - SUFFICIENT,
3 - MEDIUM,
4 - GOOD,
5 - VERY GOOD.


In this classification, the types of utterances and pauses were taken into account.


The tentative grading scheme that we propose assigns VERY GOOD to a participant if a
large number of short pauses combined with agreement/affirmation utterances appear
in the conversation. GOOD is given for short pauses with Y/N, explanation, and
continuation utterances, MEDIUM for medium pauses with agreement and explanation
utterances, SUFFICIENT for long pauses with explanation and agreement utterances,
and INSUFFICIENT for long pauses with continuation and non-agreement utterances
(see Table 4).


Table 4
Grading type of participant   Grading characteristics
1/INSUFFICIENT                Long pauses - continuation - non-agreement utterances
2/SUFFICIENT                  Long pauses - explanation - agreement utterances
3/MEDIUM                      Medium pauses - agreement - explanation utterances
4/GOOD                        Short pauses - agreement - explanation utterances
5/VERY GOOD                   Short pauses - agreement - affirmation utterances
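

One possible way of turning Table 4 into rules is sketched below. It is only one
interpretation of the tentative scheme, not the procedure applied in the manual
analysis. It assumes that a participant is described by a list of pause codes such as
PS-A or PL-E, that the most frequent pause type decides between the rows of the
table, and that the utterance codes observed with that pause type decide between the
two long rows and between the two short rows. The function name and the tie-breaking
behaviour are hypothetical.

```python
from collections import Counter

GRADE_NAMES = {1: "INSUFFICIENT", 2: "SUFFICIENT", 3: "MEDIUM",
               4: "GOOD", 5: "VERY GOOD"}


def grade(pause_codes):
    """Map the pause codes of one participant (e.g. ['PS-A', 'PL-E']) to 1..5."""
    if not pause_codes:
        raise ValueError("at least one pause code is needed")
    pairs = [code.split("-", 1) for code in pause_codes]
    # Assumption: the most frequent pause type characterises the participant.
    # On a tie, Counter keeps the type that was encountered first.
    dominant = Counter(prefix for prefix, _ in pairs).most_common(1)[0][0]
    codes = Counter(utt for prefix, utt in pairs if prefix == dominant)
    if dominant == "PM":
        return 3
    if dominant == "PL":
        # Continuation (F) and disagreement (D) against explanation (E)
        # and agreement (A), following rows 1 and 2 of Table 4.
        return 1 if codes["F"] + codes["D"] > codes["E"] + codes["A"] else 2
    if dominant == "PS":
        # Affirmation (S) against explanation (E), following rows 5 and 4.
        return 5 if codes["S"] > codes["E"] else 4
    raise ValueError(f"unknown pause prefix: {dominant}")


example = ["PS-S", "PS-A", "PS-S", "PM-E", "PL-A"]
print(grade(example), GRADE_NAMES[grade(example)])
```

A weighted score over all pause codes would be an alternative design; the rule-based
form was chosen here because it mirrors the rows of Table 4 one by one.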


Another aspect that we took into account in the grading classification was the
coverage percentage of these pause types specific to utterances.


For example, the highest percentage is found for short affirmation pauses (80%),
followed by medium agreement pauses (45%), long agreement pauses (38%), and short
agreement pauses (31%).


The lowest values are found for medium continuation pauses (9%), medium explanatory
pauses (11%), uncertain short pauses (11%), and short explanatory pauses (13%).


6. Conclusions 


This paper aims to assess the participants' contribution in a collaborative
environment through a manual method, which can be automated, that considers the
numbers and types of utterances and pauses. From the perspective of dialogism, we
analysed the texts of the conversations and outlined the commitment level of each
participant in the conversation, as well as the participant's assessment and grading
in collaborative terms.


Computer-Supported Collaborative Learning [26] offers, besides the possibility of
effective communication between students, the possibility of assessing and grading
the participants in a conversation.


In this paper we took into account utterances that have explicit references and,
based on adjacency pairs of such utterances, we experimented with how an analysis of
the types of pauses can be performed using manual annotation. We also established a
grading level for the participants in the conversation, starting from utterance
numbers and types of pauses.


In the future, we will extend the analysis to the entire corpus of chats that we have
developed in recent years, identify all types of pauses and all utterances for the
entire corpus, and develop and assess the grading criteria based on pauses.